Dynamics on Networks

Because of Python's flexibility, it is not only easy to do network analysis using NetworkX, but with a minimal amount of code we can simulate dynamics on networks.

In this section we'll simulate disease dynamics on NetworkX graphs. This last section is optional, you can choose to work on it, or to try to start doing your own work using NetworkX. I'll be available to help.

SIR Dynamics

In this section you will implement and describe SIR dynamics on a network. You should implement the dynamics as follows:

Mark all nodes as susceptible
Select a single node to begin the infection.
While there are infected nodes \textbf{do}:
- For each infected node $u$ in the previous step
  1. For each neighbor of $u$, $v$
    - If $v$ is susceptible set $v$ to infected with probability $\beta$
  2. Set $u$ to recovered

Note here that we are assuming that nodes are infected for exactly one time step before they move into a recovered state, i.e. $\gamma=1$. Below is the outline of the function



In [4]:

    
import networkx as nx



In [ ]:

    
import numpy as np



In [5]:

    
def simulate_SIR(G,beta,n0=None):
    if n0 is None:
        n0 = [random.choice(G.nodes())]
    infected = [set(n0)]
    recovered = set([])
    t = 1
    while True:
        #
    return infected,recovered

There are three graphs stored in the data folder that correspond to three different real world networks. They are stored as edgelists so use the appropriate function to read them

celegans.g: A metabolic network of the nematode Caenorhabditis elegans[1]
jazz.g: A network of collaborations among Jazz Musicians.[2]
ISPDec2014.g: A network of connections between ISPs, taken during the last week of December 2014.[3]

[1] Duch, Jordi, and Alex Arenas. "Community detection in complex networks using extremal optimization." Physical review E 72.2 (2005): 027104.

[2] Gleiser, Pablo M., and Leon Danon. "Community structure in jazz." Advances in complex systems 6.04 (2003): 565-573.

[3] Edwards, Benjamin, et al. "Analyzing and Modeling Longitudinal Security Data: Promise and Pitfalls." Proceedings of the 31st Annual Computer Security Applications Conference. ACM, 2015.

Infections over time

For each graph, simulate the SIR dynamics starting with a single node and record the fraction of nodes in the network infected over time for three different values of $\beta$. Because the SIR dynamics are stochastic, you will have to simulate each infection multiple times. Plot the average of these runs over time as well as the standard errors as error bars for each value of $\beta$.

Infection Properties

Next investigate you will investigate the behavior of infection spread for various values of $\beta$. Select at least 20 different values of $\beta$. Simulate the SIR dynamics on the network starting with a random node, measuring the total proportion of the network that becomes infected. Be sure to simulate the infection enough times that you can reasonably estimate mean and standard deviation of each of these measures (at least 100). For each measure, make a plot of the $\beta$ values and the measure for each of the three networks. Include the mean and error bars for the standard deviation. Report on what each measure tells us about the three different networks.

Influential Spreaders

It might be important to know which nodes in the network are most capable of spreading disease. This may be important to identify the best way to stop the spread of infections or the best way to spread information in a network. We will measure how influential a node is by measuring the average proportion of the network which becomes infected when the infection starts with that node. Using $\beta=0.2$ measure the mean infection size when started from each node in each network. Once again be run at least 100 simulations for each node. In a table, report the most and least influential nodes, and the average size of the infection they create.

Network Measures to Identify Influential Spreaders

Rather than run simulations it may be useful to use other network measures to identify influential spreaders in the network. Using the software package of your choice compute each of the following measures for each node in the network:

Degree: Number of edges each node has.
Average Shortest Path Length: For each node $u$ compute the shortest path to all other nodes $v$, and take the average of their lengths
Betweenness Centrality: The fraction of all shortest paths a node in the network participates in.

For each network, make a scatter plot where each point is each the $(x,y)$ pair, where $x$ is each of the above measures for a single node, and $y$ is the average infection size when the infection starts at that node.

Which of the measures provides the best prediction of infection size? Why does each perform well or poorly? Investigate other network measures and speculate why they might be better at identifying influential spreaders. Be sure to cite your sources.



In [ ]: